HIGH-SPEED TURBO DECODER
BACKGROUND OF THE INVENTION 

The present invention relates generally to error-correction 
coding and, more particularly, to a decoder for parallel 
concatenated codes, e.g., turbo codes. 

A new class of forward error control codes, referred to as 
turbo codes, offers significant coding gain for power limited 
communication channels. Turbo codes are generated using 
two recursive systematic encoders operating on different 
permutations of the same information bits. A subset of the 
code bits generated by each encoder is transmitted in order 
to maintain bandwidth efficiency. Turbo decoding involves 
an iterative algorithm in which probability estimates of the 
information bits that are derived for one of the codes are fed 
back to a probability estimator for the other code. Each 
iteration of processing generally increases the reliability of 
the probability estimates. This process continues, alternately 
decoding the two code words until the probability estimates 
can be used to make reliable decisions. 

The maximum a posteriori (MAP) type algorithm intro- 
duced by Bahl, Cocke, Jelinek, and Raviv in "Optimal 
Decoding of Linear Codes for Minimizing Symbol Error 
Rate", IEEE Transactions on Information Theory, March 
1974, pp. 284-287, is particularly useful as a component 
decoder in decoding parallel concatenated convolutional 
codes, i.e., turbo codes. The MAP algorithm is used in the 
turbo decoder to generate a posteriori probability estimates 
of the systematic bits in the code word. These probability 
estimates are used as a priori symbol probability estimates 
for the second MAP decoder. Three fundamental terms in the 
MAP algorithm are the forward and backward state prob- 
ability functions (the alpha and beta functions, respectively) 
and the a posteriori transition probability estimates (the 
sigma function). 

It is desirable to provide a turbo decoder which efficiently 
uses memory and combinatorial logic such that the structure 
thereof is highly streamlined with parallel signal processing. 
It is further desirable to provide such a structure which is 
amenable to implementation on an application specific inte- 
grated circuit (ASIC). 

BRIEF SUMMARY OF THE INVENTION 

A high-speed turbo decoder utilizes a MAP decoding 
algorithm and comprises a streamlined construction of func- 
tional units, or blocks, amenable to ASIC implementation. 
The turbo decoder comprises a gamma block, alpha and beta 
blocks, and a sigma block. The gamma block provides 
symbol-by-symbol a posteriori state transition probability 
estimates (values of the gamma probability function), only 
four non-zero gamma probability function values being 
possible at any particular trellis level. Two gamma prob- 
ability function values are provided via selection switches to 
the alpha and beta blocks for calculating the alpha and beta 
probability function values, i.e., performing the alpha and 
beta recursions, respectively, in parallel, thus significantly 
increasing decoding speed. The alpha and beta blocks have 
as many state update circuits as there are states in the trellis. 
A scaling or normalization circuit monitors the values of the 
alpha and beta probability functions and prescribes a scale 
factor such that all such values at a trellis level remain within 
the precision limits of the system. Previously calculated 
values of these probability functions are used for the nor- 
malization calculation in order to remove the normalization 
calculation from the critical path in the alpha and beta blocks 
and thus increase decoding speed. The outputs of the alpha
and beta blocks are buffered and provided as inputs to the
sigma block. The sigma block determines the a posteriori
state transition probability estimates (sigma values) and uses
the sigma values to provide the a posteriori bit probability
estimates, i.e., the soft-decision outputs of the turbo decoder.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram illustrating the general structure 
of a turbo decoder employing a MAP decoding algorithm; 
FIG. 2 is a block diagram illustrating a gamma calculator
of a turbo decoder according to a preferred embodiment of
the present invention;

FIG. 3 is a block diagram illustrating an alpha and beta
block of a turbo decoder according to a preferred embodiment
of the present invention;

FIG. 4a is a block diagram illustrating an alpha update
circuit of FIG. 3 according to a preferred embodiment of the
present invention;

FIG. 4b is a block diagram illustrating a beta update
circuit of FIG. 3 according to a preferred embodiment of the
present invention;

FIG. 5a is a block diagram illustrating a sigma calculator
of a turbo decoder according to a preferred embodiment of
the present invention;

FIG. 5b is a block diagram illustrating a 2-Sums-and-Log-Addition
block 66 for the sigma calculator of FIG. 5a
in more detail;

FIG. 5c is a block diagram illustrating a logarithmic adder
(i.e., log-addition block 76 of FIG. 5b and log-addition
blocks 67, 68 and 69 of FIG. 5a) in more detail;

FIG. 6 is a block diagram illustrating the data flow for a
turbo decoder according to a preferred embodiment of the
present invention;

FIG. 7 is a block diagram illustrating a gamma block 
according to preferred embodiments of the present inven- 
tion; 

FIG. 8a is a block diagram illustrating a circuit for
updating alpha and beta recursions according to a preferred
embodiment of the present invention;

FIG. 8b is a block diagram illustrating a soft limiter
function suitable for use in the update circuit of FIG. 8a;

FIG. 9 is a block diagram illustrating an alternative
embodiment of the alpha and beta recursion update circuit of
FIG. 8;

FIG. 10 is a block diagram illustrating another alternative
embodiment of the alpha and beta recursion update circuit of
FIG. 8;

FIG. 11 is a block diagram illustrating one embodiment of
the calculation of the alpha and beta recursion, including
calculation of the normalization factor and calculations of
the alpha and beta values as part of the alpha and beta
recursion update circuitry; and

FIG. 12 is a block diagram illustrating an alternative
preferred embodiment of the calculation of the normalization
factor.

DETAILED DESCRIPTION OF THE INVENTION

Turbo Decoder Structure 

The MAP decoder uses the received sequence $Y_1^T$ to
estimate the a posteriori state and transition probabilities of
a Markov source:

$$\Pr\{S_t = m \mid Y_1^T\} = \Pr\{S_t = m;\, Y_1^T\} / \Pr\{Y_1^T\} = \lambda_t(m) / \Pr\{Y_1^T\} \tag{1}$$



US 6,304,996 B1



and

$$\Pr\{S_{t-1} = m';\, S_t = m \mid Y_1^T\} = \Pr\{S_{t-1} = m';\, S_t = m;\, Y_1^T\} / \Pr\{Y_1^T\} = \sigma_t(m',m) / \Pr\{Y_1^T\} \tag{2}$$

The joint probability estimates $\lambda_t(m)$ and $\sigma_t(m',m)$ are
computed and normalized such that their sum adds to one,
resulting in the desired state and transition probability
estimates.

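The alpha, beta and gamma functions are set forth as
follows:

$$\alpha_t(m) = \Pr\{S_t = m;\, Y_1^t\} \tag{3}$$

$$\beta_t(m) = \Pr\{Y_{t+1}^T \mid S_t = m\} \tag{4}$$

and

$$\gamma_t(m',m) = \Pr\{S_t = m;\, Y_t \mid S_{t-1} = m'\} \tag{5}$$

so that

$$\lambda_t(m) = \alpha_t(m) \cdot \beta_t(m) \tag{6}$$

and the a posteriori state transition probabilities are
determined as follows:

$$\sigma_t(m',m) = \alpha_{t-1}(m') \cdot \gamma_t(m',m) \cdot \beta_t(m) \tag{7}$$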

The alpha function is computed from the following recursion:

$$\alpha_t(m) = \sum_{m'} \alpha_{t-1}(m') \cdot \gamma_t(m',m) \tag{8}$$

The beta function is calculated using the following recursion:

$$\beta_t(m) = \sum_{m'} \beta_{t+1}(m') \cdot \gamma_{t+1}(m,m') \tag{9}$$

Finally, the gamma function is defined as follows:

$$\gamma_t(m',m) = \sum_{X} p_t(m \mid m') \cdot q_t(X \mid m',m) \cdot R(Y_t \mid X) \tag{10}$$

where $p_t(m \mid m')$ are the Markov transition probabilities, and
$q_t(X \mid m',m)$ is the distribution of the source's output symbols.

Turbo codes are constructed as two recursive systematic
codes concatenated in parallel. A MAP decoder for a turbo
code generates a probability estimate of the systematic bits
in a code word, based on one of the two recursive systematic
codes, and provides this information to a second MAP
decoder which decodes the other component code of the
turbo code. The second decoder uses these probability
estimates as a priori information and generates new estimates
of the systematic bits in the code word. The updated
estimates are provided to the first MAP decoder, which, in
turn, generates updated estimates. This feedback process
continues a finite number of times, and a decision on the
systematic bits is made based on the final probability
estimates. One decoding of each component code word
comprising a turbo code word is referred to as a decoding
iteration; a typical number of iterations is eight.

The two parallel codes in a turbo code are referred to
herein as the top code and the bottom code. Normally, the
data is encoded by the top code and is interleaved using
either a fixed block interleaver or a random interleaver
before being encoded by the bottom code. A random
interleaver is usually preferred since the coding gain is higher
with the same (interleaved) block length.

FIG. 1 illustrates a turbo decoder employing component
MAP decoders 12 and 14. As shown, the top code parity data
is provided along with the systematic data to a top code
memory 16 and then to MAP decoder 12. The systematic
data is also provided, via an interleaver 18, along with the
bottom code parity data to a bottom code memory 20 and
then to the second MAP decoder 14. FIG. 1 also shows the
feedback loop involving MAP decoders 12 and 14,
interleaver 18, address generator 19, de-interleaver 22, and a
probability estimate memory 24 for implementing a MAP
decoding algorithm as described hereinabove.

The systematic bit probability estimates are computed
using the a posteriori transition or $\sigma_t(m',m)$ probabilities.
The sum of all a posteriori transition probabilities
corresponding to trellis branches which are labeled with the same
data bit value is the a posteriori probability that such data bit
value is the correct decoded bit value. The output of a MAP
component decoder is an a posteriori probability estimate of
the systematic symbols, denoted as $APP_t(0)$ and $APP_t(1)$, as
set forth in the following expression:

$$APP_t(k) = \Pr\{d_t = k \mid Y_1^T\} = \sum \sigma_t(m',m) \tag{11}$$

where the summation is over all $\sigma_t(m',m)$ values where the
systematic bit corresponding to the transition (m',m) is k.

The MAP decoding algorithm is a memory intensive and
computationally intensive algorithm due primarily to the
alpha and beta functions. The alpha and beta functions are
recursive operations which begin at opposite ends of the
received sequence. Normally, the alpha function would be
computed first; then the beta function and sigma function
would be calculated. In preferred embodiments of the
present invention, the alpha and beta function values are
calculated in parallel.

The alpha function is defined by the following recursion:

$$\alpha_t(m) = \sum_{m'} \alpha_{t-1}(m') \cdot \gamma_t(m',m) \tag{12}$$

where the summation is over all states where the transition
(m',m) exists, and the beta recursion is defined as follows:

$$\beta_t(m) = \sum_{m'} \beta_{t+1}(m') \cdot \gamma_{t+1}(m,m') \tag{13}$$

where the summation is over all states where the transition
(m,m') exists. The alpha and beta functions must be
computed for all states (m) and for all trellis levels (t).
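
As a hedged illustration of the recursions of equations 12 and 13 and the summation of equation 11, the following Python sketch runs them in the probability domain on a toy trellis. The two-state trellis, the random gamma values, the uniform starting values for the beta recursion, the per-level rescaling, and every identifier (`TRANSITIONS`, `map_decode`, and so on) are assumptions made for illustration only; the decoder described herein uses a larger trellis and a logarithmic implementation.

```python
import random

# Toy 2-state recursive trellis (an assumption for illustration only):
# (previous state m', next state m) -> systematic bit labelling that branch.
TRANSITIONS = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def map_decode(gammas, transitions=TRANSITIONS, num_states=2):
    """Return [APP_t(0), APP_t(1)] for every trellis level.

    gammas[t][(mp, m)] is the gamma value of branch mp -> m at level t.
    """
    T = len(gammas)
    alpha = [[0.0] * num_states for _ in range(T + 1)]
    beta = [[0.0] * num_states for _ in range(T + 1)]
    alpha[0][0] = 1.0                           # state 0 starts with probability 1
    beta[T] = [1.0 / num_states] * num_states   # assumed: unterminated trellis

    for t in range(T):                          # forward recursion, equation 12
        for (mp, m) in transitions:
            alpha[t + 1][m] += alpha[t][mp] * gammas[t][(mp, m)]
        scale = sum(alpha[t + 1])               # rescale to stay within precision
        alpha[t + 1] = [a / scale for a in alpha[t + 1]]

    for t in range(T - 1, -1, -1):              # backward recursion, equation 13
        for (m, mp) in transitions:
            beta[t][m] += beta[t + 1][mp] * gammas[t][(m, mp)]
        scale = sum(beta[t])
        beta[t] = [b / scale for b in beta[t]]

    apps = []
    for t in range(T):                          # sigma values summed per bit value
        app = [0.0, 0.0]
        for (mp, m), bit in transitions.items():
            app[bit] += alpha[t][mp] * gammas[t][(mp, m)] * beta[t + 1][m]
        total = app[0] + app[1]
        apps.append([app[0] / total, app[1] / total])
    return apps

random.seed(0)
gammas = [{br: random.uniform(0.1, 1.0) for br in TRANSITIONS} for _ in range(5)]
apps = map_decode(gammas)
```

Because each level is rescaled by a single common factor, the rescaling cancels when the two bit probabilities at a level are normalized, so it does not change the resulting estimates.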

For example, for systematic codes with binary input
symbols, the number of terms in the summation is two; and
for typical turbo codes, the number of states is sixteen, and
the number of trellis levels is greater than two hundred.
Assuming a sixteen-state trellis and eight iterations of
decoding, a parallel multiplication followed by an addition
function must be executed five hundred twelve times for
each trellis level. The decoding of each trellis level only
yields a single bit of user information.

Equations 12 and 13 indicate that the alpha (beta)
recursions depend on the previous values of the alpha (beta)
recursion and the $\gamma_{t+1}(m',m)$ (gamma) function. The quantity
$\gamma_t(m',m)$ is the joint probability of state m at time t and of
receiving symbol $Y_t$, given that the state at time t-1 was m',
and can be computed as

$$\gamma_t(m',m) = \sum_{X} p_t(m \mid m') \cdot q_t(X \mid m',m) \cdot R(Y_t \mid X_t) \tag{14}$$

In a recursive systematic code, a single transition (m',m)
determines a specific channel symbol $X_t$ so that the
summation is eliminated and the function $q_t(X_t \mid m',m)$ is
identically equal to one. Also, for a rate one-half code and binary
signaling, the channel transition probabilities are computed
as the product of two one-dimensional transition probability
estimates for the information and parity bits as follows:

$$R(Y_t \mid X_t) = R(Y_{ti} \mid X_{ti}) \cdot R(Y_{tp} \mid X_{tp}) \tag{15}$$

assuming a memoryless channel.

The transition probability $p_t(m \mid m')$ is zero for invalid
transitions and is otherwise equal to the a priori bit
probability:

$$\gamma_t(m',m) = \begin{cases} 0 & \text{when } p_t(m \mid m') = 0 \\ AP_t(0) \cdot R(Y_{ti} \mid 0) \cdot R(Y_{tp} \mid X_{tp}) & \text{when } (X_{ti} \mid m',m) = 0 \\ AP_t(1) \cdot R(Y_{ti} \mid 1) \cdot R(Y_{tp} \mid X_{tp}) & \text{when } (X_{ti} \mid m',m) = 1 \end{cases} \tag{16}$$

where $AP_t(k)$ is the a priori probability for the systematic bit
at trellis level t. $X_{tp}$ can assume only a 0 or 1 value so that,
at any trellis level, there can be only four possible non-zero
values for gamma:

$$\begin{aligned}
\gamma_{t,00}(m',m) &= AP_t(0) \cdot R(Y_{ti} \mid 0) \cdot R(Y_{tp} \mid 0) \\
\gamma_{t,01}(m',m) &= AP_t(0) \cdot R(Y_{ti} \mid 0) \cdot R(Y_{tp} \mid 1) \\
\gamma_{t,10}(m',m) &= AP_t(1) \cdot R(Y_{ti} \mid 1) \cdot R(Y_{tp} \mid 0) \\
\gamma_{t,11}(m',m) &= AP_t(1) \cdot R(Y_{ti} \mid 1) \cdot R(Y_{tp} \mid 1)
\end{aligned} \tag{17}$$

For a logarithmic implementation, equations 17 are
rewritten as follows:

$$\begin{aligned}
\ln \gamma_{t,00}(m',m) &= \ln AP_t(0) + \ln R(Y_{ti} \mid 0) + \ln R(Y_{tp} \mid 0) \\
\ln \gamma_{t,01}(m',m) &= \ln AP_t(0) + \ln R(Y_{ti} \mid 0) + \ln R(Y_{tp} \mid 1) \\
\ln \gamma_{t,10}(m',m) &= \ln AP_t(1) + \ln R(Y_{ti} \mid 1) + \ln R(Y_{tp} \mid 0) \\
\ln \gamma_{t,11}(m',m) &= \ln AP_t(1) + \ln R(Y_{ti} \mid 1) + \ln R(Y_{tp} \mid 1)
\end{aligned} \tag{18}$$

These four gamma values depend only on the a priori bit
probability and the received symbol for the trellis level. All
four values are used multiple times in the alpha and beta
recursions. FIG. 2 illustrates a gamma calculator circuit 30
for computing the gamma values set forth in equations 18.
As shown, the inputs to the gamma calculator are the
logarithms of the channel transition probabilities,
$R(Y_{ti} \mid 0)$, $R(Y_{ti} \mid 1)$, $R(Y_{tp} \mid 0)$, $R(Y_{tp} \mid 1)$; and the logarithms of the a
priori bit probabilities $AP_t(0)$ and $AP_t(1)$. FIG. 2 illustrates
selection switches 32 and adders 34 for implementing
equations 18.

The computation of the channel transition probability
function R(·|·) is performed using a look-up table. For
example, if the input data $Y_t$ is quantized to sixty-four levels,
a look-up table containing the one hundred twenty-eight
different values (64 for $X_t = 0$ and 64 for $X_t = 1$) of the function
R(·|·) can be computed in advance and stored in memory.
When a specific value of the channel transition probability
function is required, the pre-computed value can be read
from the table and provided to the gamma calculator. The
memory required to store the one hundred twenty-eight table
elements is small, and any performance loss due to
sixty-four-level (six-bit) quantization of the inputs is minimal.

The size of the table can be reduced to half of the size
described hereinabove by exploiting the symmetry of the
channel probability functions about the value 0.5. To this
end, a table containing the values R(y|1) and R(y|0) for only
either y>0.5 or y<0.5 is required. Assuming that the
received data is sampled symmetrically about the 0.5 value,
each table contains only 32 values. And assuming
that the tables are loaded with the channel probability values
for y>0.5, and an input sample of a value less than 0.5 is
received, then the required R(y|1) and R(y|0) are obtained by using the
lookup tables to find the values R(0.5-y|1) and R(0.5-y|0)
and using the relations R(y|1)=R(0.5-y|0) and R(y|0)=R(0.5-y|1).
The implementation of this reduction in the size of
the lookup table [...]

The total memory required in the probability tables can be
further reduced by half by scaling the tables so that for each
value of y, one of the R(y|1) or R(y|0) values is exactly unity.
In the case where the decoder is implemented in the
log-domain (ln(1)=0), one of the values is identically zero. With
the scaling of tables described hereinabove, the table
containing the probability values for R(y|1) will contain all ones
(zeros for the log-domain decoder). This table can then be
eliminated from the channel probability calculation circuit.
The scaling introduced to the tables does not affect decoding
performance because the decoder makes bit probability
estimates based on the ratios of probabilities and not on the
absolute probabilities.

In the case where the channel probability distribution is
Gaussian and the decoder is implemented in the log-domain,
it is possible to replace the above mentioned lookup table
with a multiply circuit. The ratio of R(y|1) and R(y|0) in the
log-domain can be computed as

$$\begin{aligned}
\log\left[\frac{R(y \mid 1)}{R(y \mid 0)}\right]
&= \log\left[\frac{\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(y-1)^2}{2\sigma^2}\right]}{\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(y-0)^2}{2\sigma^2}\right]}\right] \\
&= \log\left[\exp\left[-\frac{(y-1)^2}{2\sigma^2} + \frac{(y-0)^2}{2\sigma^2}\right]\right] \\
&= -\frac{(y^2 - 2y + 1) - y^2}{2\sigma^2} \\
&= \frac{1}{\sigma^2}\left(y - \frac{1}{2}\right)
\end{aligned}$$

The coefficient in the multiplication is $1/\sigma^2$,
while the $(y - 1/2)$ term represents the input data samples.

In a preferred embodiment of the invention, the gamma
values from gamma calculator 30 are provided to circuits
used to compute the alpha and beta functions, i.e., alpha and
beta blocks 40 wherein the alpha and beta values for each
state in the trellis are calculated in parallel, as illustrated in
FIG. 3. This parallel architecture significantly increases
decoding speed compared to calculating the alpha and beta
values for each state in series since the alpha and beta
recursions fundamentally limit the computation speed of the
MAP decoding algorithm. A single update of an alpha (or
beta) value involves both the multiplication of two previous
alpha (or beta) values by the appropriate probability
estimates and the summation of these products. In general, these
computations must be completed for all states in the trellis
before the recursion for the next trellis level can start. Such
parallel processing requires as many state update circuits 42
(or 43) (two parallel multiplications followed by an
addition) as there are states in the trellis. To minimize the
delay caused by the alpha and beta recursions, the alpha and
beta recursions are performed in parallel (as shown in FIG.
6 described hereinbelow), requiring a separate
computational block 40 for each. For example, thirty-two parallel
multiply-and-add circuits are needed in such a turbo decoder
architecture for a sixteen-state code. The parallel
computation of the alpha and beta functions effectively increases the
decoding speed of the turbo decoder by a factor of thirty-two
over a serial approach.

Each alpha and beta computational circuit requires two
values of the alpha (or beta) function at the previous time
instant. The two previous values are dependent on the trellis
of the code. If the code is recursive systematic with binary
inputs, then there are two valid transitions into and out of
each state. Furthermore, if the memory of the code is fixed,
then the two state probability estimates required for either
the alpha or beta recursions are fixed. Since this connectivity
is known, the feedback portion of the alpha (or beta) circuit
can be hard-wired. In general, the alpha and beta circuits
require different connections, however.

Each of the alpha and beta computational circuits also
requires two of the gamma values that have been calculated
in the gamma calculator 30 (FIG. 2). The four possible
gamma values from the gamma calculator are available to
each alpha and beta circuit 40 (FIG. 3). Selection of the
appropriate gamma value used in the alpha and beta circuits
is performed using selection switches TS(m,k), represented
by number 44, and TP(m,k), represented by number 46. The
switches TS(m,k) determine the systematic bit contribution;
and switches TP(m,k) determine the parity bit. The switches
TS(m,k) are also used to determine which of the two
hard-wired alpha (or beta) values of the previous trellis level
are multiplied by the selected gamma values in the current
trellis level update.

The alpha and beta blocks also include a normalization
circuit 48. The function of the normalization circuit is to
monitor the values of the alpha (beta) function and to assign
a scale factor such that all values at a trellis level remain
within the precision limits of the particular system. The
normalization function is preferably implemented in such
manner that the computation of each normalized value is
performed in parallel with the alpha (beta) calculation, as
described hereinbelow.

The initialization circuit involves setting the initial state
probabilities to known values at the start of the recursions.
The convention used herein is that state 0 has probability 1
upon initialization; all other states are initialized with
probability 0.

FIG. 4a illustrates one embodiment of alpha recursion
update circuit 42; and FIG. 4b illustrates one embodiment of
beta recursion update circuit 43. Logarithmic gamma
probability function values are provided to a selection switch or
multiplexer 50; and the other two logarithmic gamma
probability function values are provided to a second selection
switch or multiplexer 52. As shown in FIG. 4a, the
logarithmic alpha probability function values are provided to two
selection switches or multiplexers 54 and 56. Similarly, as
shown in FIG. 4b, the logarithmic beta probability function
values are provided to the two selection switches. The
outputs of switch 50 and switch 56 are provided to an adder
58. The outputs of switch 52 and switch 54 are provided to
an adder 60. The results from adders 58 and 60 are provided
to a log-addition block 62, with the resultant sum being
combined in subtracter 64 with the output of normalizer 48,
as shown, the output of which is, in turn, fed back to a
memory circuit and the normalizer. The outputs of selection
switches 54 and 56 are provided to sigma blocks for the
sigma calculations, as described hereinbelow.

The outputs of the alpha and beta blocks are buffered and
used as inputs to the sigma calculation (or sigma-AP) blocks
65, as illustrated in FIG. 5a.

The a posteriori bit probabilities are computed as the sum
of a number of sigma values as follows:

$$APP_t(k) = \sum \sigma_t(m',m) \tag{19}$$

where the sigma values are computed using the following:

$$\sigma_t(m',m) = \alpha_{t-1}(m') \cdot \gamma_t(m',m) \cdot \beta_t(m) \tag{20}$$

In calculating the a posteriori probabilities, it is desirable
to minimize the time for calculating the sigma values of
equation 20 and the summation of equation 19. Since the
alpha and beta recursions begin at opposite ends of the code
word, there is insufficient information available to compute
the sigma values until each recursion is half finished. At such
time, all of the sigma values which are functions of the alpha
and beta values at trellis indices $t_L$ and $t_R$ can be calculated.
For a sixteen-state recursive systematic code, there are
sixty-four such values. Fortunately, these can be grouped
naturally into four categories using equation 19. In a
recursive systematic code with sixteen states, there are sixteen
elements in the summation. Since the bit indices are either
0 or 1, if the trellis level index for the alpha recursion is $t_L$
and the trellis level for the beta recursion is $t_R$, four circuits
can operate in parallel in order to compute the summation of
equation 19. The four summations simultaneously compute
$APP_{t_R}(0)$, $APP_{t_R}(1)$, $APP_{t_L}(0)$ and $APP_{t_L}(1)$.

The sigma calculations also require gamma values. For
this operation, there are two gamma calculators which
calculate and supply the sigma-AP block with the four
possible gamma values for the trellis indices $t_L$ and $t_R$.
Again, matching of the appropriate alpha, beta, and gamma
values in equation 20 is performed with selection switches
TP(m,k) and TS(m,k) described hereinabove.

FIG. 5a illustrates a logarithmic implementation of a
sigma-AP block 65. In particular, a logarithmic
implementation of equation 20 requires two sums. Then, there are
fifteen pipe-lined logarithmic adders to perform the
summation of equation 19. Structurally, each block 66 includes one
log-addition function, eight additions thus being performed
in parallel. Blocks 66 are followed by two pairs of parallel
log-addition blocks 67, the output of each pair being
provided to another log-addition block 68. The outputs of the
two log-addition blocks 68 are then provided to the fifteenth
log-addition block 69.

FIG. 5b illustrates a 2-Sums-and-Log-Addition block 66
of FIG. 5a in more detail. The present alpha or beta function
value and the corresponding value from memory are
provided to a summer 70. The output of summer 70 is provided
to a second summer 71 along with the appropriate gamma
value chosen by multiplexer 72. Similarly, in the illustrated
lower path, the present alpha or beta function value and the
corresponding value from memory are provided to a summer
74, the output of summer 74 being provided to another
summer 75 along with the appropriate gamma value chosen
by multiplexer 73. The outputs from summers 71 and 75 are
provided to a log-addition block 76, the output of which is
clocked by a register 77.
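
The structure of the sigma-AP summation, in which sixteen branch values are formed by two sums each and then combined by fifteen log-additions (eight, then four, then two, then one), can be sketched in Python as follows. This is an illustrative model only: the function names, the floating-point arithmetic, the example input values, and the layout of one list entry per trellis branch are assumptions, not details of the circuit described herein.

```python
import math

def log_add(a, b):
    """ln(exp(a) + exp(b)), written as a maximum plus a correction term."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def sigma_ap(ln_alpha, ln_beta, ln_gamma):
    """Log-domain sum of sixteen sigma values for one bit value.

    Each list holds one entry per trellis branch labelled with that bit
    value (an assumed data layout). Returns (ln APP, log-addition count).
    """
    # Two sums per branch: (ln alpha + ln beta) + ln gamma, as in equation 20.
    level = [(a + b) + g for a, b, g in zip(ln_alpha, ln_beta, ln_gamma)]
    count = 0
    while len(level) > 1:
        # Pairwise log-additions: 8, then 4, then 2, then 1.
        level = [log_add(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        count += len(level)
    return level[0], count

# Arbitrary example values, chosen only to exercise the adder tree.
ln_alpha = [math.log(0.01 * (i + 1)) for i in range(16)]
ln_beta = [math.log(0.02 * (16 - i)) for i in range(16)]
ln_gamma = [math.log(0.5)] * 16
ln_app, log_adds = sigma_ap(ln_alpha, ln_beta, ln_gamma)
```

In this model the first stage of eight log-additions corresponds to the eight blocks 66, and the remaining seven to blocks 67, 68 and 69.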

FIG. 5c illustrates a log-addition block suitable for
implementation as log-addition block 76 of FIG. 5b and also as
blocks 67, 68 and 69 of FIG. 5a. With respect to block 76,
for example, the outputs of summers 71 and 75 are provided
as inputs IN1 and IN2, respectively, to block 76. The
difference between inputs IN1 and IN2 is determined by a
comparator 78, the output of which is provided to an
absolute value function block 80. The output of absolute
value function block 80 is provided to a log-addition look-up
table block 81. A multiplexer 79 also receives inputs IN1 and
IN2 and selects the appropriate input for addition in adder 82
to the output of the look-up table block 81. The resultant sum
is the output of the log-addition block.
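
A minimal fixed-point model of such a log-addition block is sketched below in Python. It rests on several assumptions that are not stated for FIG. 5c: values are integers equal to sixteen times the natural logarithm of a probability, the table entries are rounded to the nearest integer, and the multiplexer selects the larger input (with a negative-logarithm convention the smaller input would be selected instead). The names `SCALE`, `LUT` and `log_addition_block` are hypothetical.

```python
import math

SCALE = 16  # assumed fixed-point scaling: integer value = 16 * ln(probability)

# Look-up table (block 81): correction term indexed by |IN1 - IN2|.
# Entries are generated until the rounded correction reaches zero.
LUT = []
d = 0
while True:
    c = round(SCALE * math.log1p(math.exp(-d / SCALE)))
    if c == 0:
        break
    LUT.append(c)
    d += 1

def log_addition_block(in1, in2):
    """Integer log-addition following the block order of FIG. 5c."""
    diff = in1 - in2                                      # comparator 78
    index = abs(diff)                                     # absolute value block 80
    correction = LUT[index] if index < len(LUT) else 0    # look-up table 81
    selected = in1 if diff >= 0 else in2                  # multiplexer 79 (assumed: larger)
    return selected + correction                          # adder 82

result = log_addition_block(-40, -55)
```

Under these assumptions the table built above has 56 non-zero entries, and the result differs from the exact scaled logarithm of the sum by at most half a unit because only the correction term is rounded.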

FIG. 6 is a top level view illustrating data flow in the turbo 
decoder 10. Each block, or functional unit, has a specific 
function to perform and can be built and tested as a separate 
unit. The blocks within the turbo decoder in the data path are
the gamma block 90, the alpha and beta blocks 40, and the 
sigma-AP blocks 65, as described hereinabove. 

The gamma block 90 includes data interfaces to the user. 
The gamma block contains the four gamma calculator 
circuits 30 (FIG. 2). The gamma block also has sufficient
memory to store the received samples for the code word and 
to store a calculated a posteriori probability for each data 
symbol (i.e., systematic bit) within the code word. 
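
Each gamma calculator circuit forms the four log-domain gamma values of equations 18 from two a priori terms and four channel terms using only additions. A hedged Python sketch of that computation follows; the function name, the argument layout, and the example probabilities are assumptions chosen for illustration.

```python
import math

def gamma_calculator(ln_ap, ln_r_sys, ln_r_par):
    """Return the four log-domain gamma values of equations 18.

    ln_ap[k]    : ln AP_t(k), a priori log-probability of systematic bit k
    ln_r_sys[k] : ln R(Y_ti | k), systematic channel log-probability
    ln_r_par[k] : ln R(Y_tp | k), parity channel log-probability
    The result is keyed by (systematic bit, parity bit).
    """
    return {(i, p): ln_ap[i] + ln_r_sys[i] + ln_r_par[p]
            for i in (0, 1) for p in (0, 1)}

# Example inputs (arbitrary probabilities, for illustration only).
ln_gamma = gamma_calculator(
    ln_ap=[math.log(0.5), math.log(0.5)],
    ln_r_sys=[math.log(0.8), math.log(0.2)],
    ln_r_par=[math.log(0.3), math.log(0.7)],
)
```

The key (1, 0), for instance, corresponds to the gamma value for systematic bit 1 and parity bit 0.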

FIG. 7 illustrates gamma block 90 in more detail. IP cells
92 and 94 convert received symbols (IPDAT) from the
channel into the negative of the logarithm of each respective
channel transition probability. The other illustrated input,
SNR (signal-to-noise ratio), is a parameter that selects one
of four tables, for example, which implement the function of
the block. An AP cell 96 receives as inputs the outputs of the
sigma block. The AP cell takes the difference of the inputs
and forms the log likelihood ratio of the bit value. The log
likelihood ratios are stored in the AP cell. The outputs of the
AP cell are as follows: the transformed inputs (i.e., the
negative of the logarithms of the two input probabilities); the
sign bit DBIT (i.e., the hard decision output of the decoder);
and the soft-decision output APOUT (i.e., the log likelihood
ratio). The outputs of the IP cells and the AP cell are then
utilized by the gamma calculator circuits 30, as described
hereinabove with reference to FIG. 2.
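
The AP cell operations described above (difference of the inputs, log likelihood ratio, sign bit, and transformed outputs) can be modeled roughly as follows. The sign convention of the ratio, the mapping of its sign to DBIT, and the formulas used to regenerate the two negative-log probabilities from the ratio are assumptions made for this sketch; the function name `ap_cell` is hypothetical.

```python
import math

def ap_cell(neg_ln_p0, neg_ln_p1):
    """Model of an AP cell for one systematic bit.

    Inputs are assumed to be -ln of the (possibly unnormalized) sigma-block
    outputs for bit values 0 and 1.
    """
    llr = neg_ln_p0 - neg_ln_p1             # equals ln[P(1) / P(0)]
    dbit = 1 if llr > 0 else 0              # assumed: positive ratio decides bit 1
    # Negative logs of the normalized probabilities, rebuilt from the ratio:
    # -ln P(0) = ln(1 + e^llr) and -ln P(1) = ln(1 + e^-llr).
    neg_ln_ap0 = math.log1p(math.exp(llr))
    neg_ln_ap1 = math.log1p(math.exp(-llr))
    return llr, dbit, (neg_ln_ap0, neg_ln_ap1)

# Unnormalized example inputs in the ratio 0.2 : 0.8.
llr, dbit, (neg_ln_ap0, neg_ln_ap1) = ap_cell(-math.log(0.2 * 0.05),
                                              -math.log(0.8 * 0.05))
```

Because only the difference of the inputs is used, any common scale factor on the two sigma-block outputs drops out of the ratio.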

Referring back to FIG. 6, the alpha and beta blocks 40
calculate the alpha and beta vectors in the turbo decoding
algorithm. In particular, as shown, there is a separate block
for computing the alpha functions and a separate block for
computing the beta functions. Computations of the alpha
and beta values involve recursive operations, each beginning
at one end of the component code word and continuing until
the other end. The difference between the alpha and beta
calculations is that they begin at opposite ends of the code
word. The recursion is defined by the trellis of the channel
code. The recursions for the alpha and beta blocks are
slightly different because the trellis appears different
depending on the direction in which it is traversed; that is,
the connections are not symmetrical about a set of nodes.
The inputs to the alpha and beta blocks are the gamma
probability function values (i.e., the symbol-by-symbol a
posteriori state transition probability estimates), which are
generated in the gamma block 90. The outputs of the alpha
and beta blocks are the alpha and beta vectors. The alpha and
beta vectors are required in the sigma blocks 65. The alpha
and beta blocks contain enough memory to store the alpha
or beta vectors for half the code word.

There are four sigma blocks 65 which calculate the sigma 
values for the transitions in the trellis. These blocks also 
compute an update of the a posteriori probability associated 
with each of the data (systematic) bits in the code word. The 
probability of the transmitted bit being zero is computed 
simultaneously with the probability of the bit being one. The 
right-hand and left-hand sides of the code word are com- 
puted simultaneously. These operations are performed in 
parallel in order to minimize the delay otherwise due to 
serial sigma block calculations. The inputs to the sigma 
block are the gamma values, computed by the gamma block, 
and the alpha and beta vectors, computed in the alpha and 
beta blocks. 

Optimization of Critical Alpha/Beta Path 

In a recursive systematic code, only two of the $\gamma_t(m',m)$
values are non-zero; therefore, an update of either an alpha
or beta value involves a parallel multiplication followed by
an addition. Then, division by a normalization value ensures
that the sum of all the state probabilities is maintained
within the precision limits of the system. The basic
operations required for an update of the alpha recursion are set
forth in the equation that follows:

$$\alpha_t(m) = \left(\alpha_{t-1}(m') \cdot \gamma_t(m',m) + \alpha_{t-1}(m'') \cdot \gamma_t(m'',m)\right) / \eta_t \tag{21}$$

where $\eta_t$ is the normalization factor at trellis level t. The
calculation of the normalization value $\eta_t$ is data dependent.
The normalization value is ideally a function of all the
state probabilities at time t. However, such an
implementation would significantly limit decoding speed since the time
taken to calculate $\eta_t$ would be added to the computation time
of equation 21. Advantageously, a preferred implementation
uses past values of the state probabilities to prescribe a
current normalization value, thus removing calculation of
the normalization value from the critical path circuit, as
described hereinbelow.
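
The idea of prescribing the normalization value from past state values can be sketched in Python as shown below. The sketch works in the log domain, where the division of equation 21 becomes a subtraction. The specific rule used here (subtracting the largest log-alpha value of the previous trellis level) is an assumption chosen only to make the sketch concrete, as are the toy two-state trellis, the large negative constant standing in for probability 0, and all identifiers.

```python
import math

NEG = -1.0e9  # assumed stand-in for ln(0) in a finite-precision system

def log_add(a, b):
    """ln(exp(a) + exp(b)) as a maximum plus a correction term."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def alpha_step(ln_alpha_prev, ln_gamma, predecessors, ln_eta):
    """One trellis level of the log-domain alpha update.

    predecessors[m] = (m1, m2): the two states with branches into state m.
    ln_gamma[(mp, m)]: log gamma of branch mp -> m.
    ln_eta: normalization value fixed before this level is computed.
    """
    return [log_add(ln_alpha_prev[m1] + ln_gamma[(m1, m)],
                    ln_alpha_prev[m2] + ln_gamma[(m2, m)]) - ln_eta
            for m, (m1, m2) in enumerate(predecessors)]

def run_alpha(ln_gammas, predecessors, ln_alpha0):
    ln_alpha = ln_alpha0
    history = [ln_alpha]
    for ln_gamma in ln_gammas:
        # The scale factor depends only on the previous level's values, so it
        # does not wait on the update it is applied to.
        ln_eta = max(ln_alpha)
        ln_alpha = alpha_step(ln_alpha, ln_gamma, predecessors, ln_eta)
        history.append(ln_alpha)
    return history

# Toy 2-state example with arbitrary branch probabilities.
predecessors = [(0, 1), (0, 1)]
probs = [0.9, 0.2, 0.4, 0.7, 0.3, 0.8, 0.6, 0.5, 0.1, 0.95, 0.35, 0.45]
ln_gammas = [{(0, 0): math.log(probs[4 * t]), (1, 0): math.log(probs[4 * t + 1]),
              (0, 1): math.log(probs[4 * t + 2]), (1, 1): math.log(probs[4 * t + 3])}
             for t in range(3)]
history = run_alpha(ln_gammas, predecessors, [0.0, NEG])
```

Since the same value is subtracted from every state at a given level, the relative sizes of the state values are unchanged, which is what the later probability ratios depend on.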

As indicated hereinabove, a logarithmic version of the 
MAP algorithm is preferred. Advantages of the use of 
logarithms include the following: (1) fewer bits of precision 
are required to obtain the same turbo decoder performance; 
and (2) multiplication becomes addition in the logarithmic 
domain. 

A typical logarithm base useful in the log-MAP algorithm
is $\sqrt[16]{e}$. Some properties of logarithms useful in the log-MAP
algorithm are:



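$$ \ln(AB) = \ln(A) + \ln(B) \quad\text{and}\quad \ln(A/B) = \ln(A) - \ln(B) \qquad (22) $$

$$
\begin{aligned}
\ln(A + B) &= \ln\{\exp[\ln(A)] + \exp[\ln(B)]\} \\
&= \ln\{\exp[\ln(A)] \cdot [1 + \exp[\ln(B)] / \exp[\ln(A)]]\} \\
&= \ln[\exp(\ln(A))] + \ln[1 + \exp[\ln(B) - \ln(A)]] \\
&= \ln(A) + \ln[1 + \exp[-(\ln(A) - \ln(B))]] \\
&= \ln(B) + \ln[1 + \exp[-(\ln(B) - \ln(A))]] \\
&= \max(\ln(A), \ln(B)) + \ln(1 + \exp[-|\ln(A) - \ln(B)|])
\end{aligned}
\qquad (23)
$$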
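
The identity derived in equation 23 can be checked numerically, and its correction term can be tabulated for integer arithmetic. The Python below is a minimal sketch, not the circuit of the specification. It assumes that integers represent logarithms in units of 1/16 (that is, logarithms to the base $e^{1/16}$), that table entries are rounded to the nearest integer, and that the difference input is at most eight bits wide. The names `max_star`, `correction`, `TABLE`, and `max_star_int` are hypothetical.

```python
import math


def max_star(ln_a, ln_b):
    """ln(A + B) computed from ln(A) and ln(B): select-largest plus correction."""
    return max(ln_a, ln_b) + math.log1p(math.exp(-abs(ln_a - ln_b)))


SCALE = 16  # assumed: one integer step equals 1/16 of a natural-log unit


def correction(d):
    """Rounded correction term for an integer difference d >= 0."""
    return round(SCALE * math.log1p(math.exp(-d / SCALE)))


# Table indexed by |ln(A) - ln(B)| in integer units (assumed 8-bit range).
TABLE = [correction(d) for d in range(256)]
NONZERO = sum(1 for v in TABLE if v != 0)


def max_star_int(x, y):
    """Integer version of max_star; x and y are scaled, rounded logarithms."""
    d = abs(x - y)
    return max(x, y) + (TABLE[d] if d < len(TABLE) else 0)
```

Under these assumptions the largest table entry is `TABLE[0]`, which corresponds to ln 2 scaled by sixteen, and the number of non-zero entries stays below sixty-four, so a six-bit index is enough to address all of them.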



The last line of equations 23 can be interpreted as per- 
forming a select-largest-value function, followed by a cor- 
rection factor that is dependent on a difference of two 
numbers. The correction factor may be implemented using a 
look-up table. Fortunately, with finite precision arithmetic,
there are only about sixty non-zero values for the function
$\ln[1+\exp[-|\ln(A)-\ln(B)|]]$ when the base of the logarithm is
$\sqrt[16]{e}$.

The logarithmic equivalent of equation 21 is

$$
\begin{aligned}
\ln[\alpha_t(m)] ={}& \max\bigl[\ln(\alpha_{t-1}(m')) + \ln(\gamma_t(m, m')),\; \ln(\alpha_{t-1}(m'')) + \ln(\gamma_t(m, m''))\bigr] \\
&+ \ln\bigl[1 + \exp\bigl(-\bigl|[\ln(\alpha_{t-1}(m')) + \ln(\gamma_t(m, m'))] - [\ln(\alpha_{t-1}(m'')) + \ln(\gamma_t(m, m''))]\bigr|\bigr)\bigr] \\
&- \ln(\eta_t)
\end{aligned}
\qquad (24)
$$

FIG. 8a illustrates an alpha and beta recursion update
circuit 100 for implementing equation 24. This circuit for
calculating the $\max(\ln(A),\ln(B))+\ln[1+\exp[-|\ln(A)-\ln(B)|]]$
function has two parallel paths. The top path involves
multiplexers 101 for choosing two gamma values, summers
102 and 104, and an absolute value function 106 followed by
a table look-up function 108. The bottom path involves
multiplexers 103 for choosing the alpha (A) and beta (B)
values, a summer 110, and a multiplexer 112. The two paths
meet in a summer 114. Following this, normalization is
performed using a subtracter 116. The output of subtracter
116 is provided to a soft-limiter function 118, which is
illustrated in more detail in FIG. 8b. The alpha and beta
recursion updates are provided via register 120.

FIG. 9 illustrates an alternative embodiment of the circuit
of FIG. 8a. One difference between the circuits of FIG. 8a
and FIG. 9 is the use of unsigned integer addition for adders
102 and 110. Since MAP algorithm quantities are probability
estimates, their values lie between 0 and 1, and the
logarithms of these probability estimates are bounded on the
interval $[-\infty, 0]$. Consequently, the sign bit required for
signed integer arithmetic can be discarded, and unsigned
arithmetic can be performed. This reduction in integer word
size reduces the processing time for the addition cells,
thereby increasing speed of operation of the circuit. It also
reduces the size of the multiplexers.

Another difference in the circuit of FIG. 9 with respect to
that of FIG. 8a is that the maximization (max) function has
been converted to a minimization (min) function because
negative numbers are represented as positive integers in this
circuit.

A further difference between the embodiments of FIG. 8a
and FIG. 9 involves the correction factor
$\ln[1+\exp[-|\ln(A)-\ln(B)|]]$. The correction factor is always a
positive value. But since numbers according to the log-MAP
algorithm are negative logarithms, this correction factor must
be subtracted from the output of the min function.
Advantageously, however, this subtraction operation has been
converted to an addition function. In particular, the values in
the table that must be subtracted for the logarithmic base of
$\sqrt[16]{e}$ range from 0 to 11. If the table is biased by -11,
then the values that must be subtracted range from -11 to 0.
Mathematically, this is equivalent to loading the table 108
with the negative of the correction value (a positive number)
and performing an addition. Since addition is faster than
subtraction, a speed improvement results. The bias can be
removed by biasing the normalization value by 11.

The look-up table values for table 108 can be implemented
as either a RAM (random access memory) or ROM
(read only memory) cell. Alternatively, since the correction
values are a monotonic function, they can be implemented
using combinatorial logic rather than an actual table. Since
the number of non-zero values in the table (before biasing)
is less than sixty-four, the input to the circuit would only
have six bits. Straightforward logic implementation of the
table could save both area and access time and thus further
contribute to improved efficiency and performance in the
critical alpha and beta path.

The speed of the circuit has been improved further by
locating the subtracter 116 for the normalization before the
adder 114 in the bottom path. This change in circuit
architecture is possible because addition is both commutative
and associative. The alpha and beta recursion update circuit of
FIG. 9 is faster than that of FIG. 8a because adder 116 operates
in parallel with elements 104, 106, and 108 rather than in
series with them.

Other operations in the critical path involve data
conditioning and limiting. Typically, the inputs to the circuit
are eight-bit unsigned integers. When two eight-bit integers are
added, a nine-bit integer results. Furthermore, since
subtraction requires signed numbers, conversion to a signed
value is needed before the subtraction. For this case, the
correction look-up input must be converted to a six-bit value.
Finally, the output of the circuit is converted back to an
unsigned integer, and its value is soft-limited so that its
maximum value is $2^8 - 1$, or 255.

Multiplexers 112 are used to select gamma function
values, and multiplexers 103 are used to select feedback
paths. Advantageously, this allows the code generator to be
a programmable parameter of the turbo decoder described
herein. If the codes of both the top and bottom decoder are
identical and are fixed, then the four multiplexers at the input
to the circuit can be eliminated, further increasing speed. In
any event, however, for flexibility, four 2:1 multiplexers
may be included in the turbo decoder design, as illustrated.

An even greater speed improvement can be achieved by
replacing the two's complement absolute value function 106
with a one's complement absolute value approximation,
without degrading performance.

A further enhancement to the alpha and beta critical path
circuit that would increase decoding speed (but would also
increase circuit complexity) involves removing the
multiplexers 103 which select the feedback path, as illustrated
in FIG. 10. Instead, multiplexers 124 are placed in the path of
the input gamma values in order to ensure that the proper
summations are performed. Since multiplexers are also
needed to route the alpha or beta values to the sigma-AP
block for the sigma calculations, such multiplexers 126 are
located after the summations with the gamma values, and
hence are not within the critical path. Registers 120 are
placed after multiplexers 124 in the gamma selection path.
The additional circuitry thus involves two additional
multiplexers and two registers for each alpha and beta update
circuit. The added registers increase the throughput delay of
the entire data path, but this increase in throughput delay is
minimal as compared with the saving in computational
speed.

Still another speed improvement involves a modification
of the output of the normalization function and the
$\ln[1+\exp[-|\ln(A)-\ln(B)|]]$ look-up table so that they can be
combined easily prior to being subtracted from the output of the
min multiplexer 127, as illustrated in FIG. 10. With these
modifications, the bias in the look-up table can be removed
and the normalization output can be modified. This is
accomplished by truncating the normalization output to the
nearest multiple of sixteen such that the four least significant
bits are zero. The addition of the four-bit output of the
correction look-up function does not require an adder. These
modifications effectively remove an adder from the critical
path circuit. In FIG. 10, the output of the normalization
circuit is shown as a four-bit value instead of an eight-bit
value because the four least significant bits are assumed to
be zero.

Calculation of the normalization factor involves selecting
the lowest of the state probabilities at trellis level t-2 and
normalizing the state probability estimates at time t by this
value. Previously calculated values of the state probability
estimates are used such that the normalization calculation
circuit is not in the critical path, as described hereinbelow.

The bias that has been added to the correction function
should not be subtracted from $\eta_t$, since the bias is already
present in the inputs to the circuit. During the cycles in
which the output of the normalization calculation is not
used, the bias can be used as the normalization value.

Parallel Implementation of Alpha and Beta Normalization

A key element of a preferred embodiment of this turbo
decoder invention is the normalization or scaling of the
alpha and beta functions' calculated values. This
normalization or scaling works together with soft limiting of
the alpha and beta functions' values to significantly reduce the
dynamic range required for the decoding arithmetic in order
to obtain the desired decoding performance. This reduction
in dynamic range is manifested in a reduction in the required
word widths, i.e., number of bits representing various
numerical values in the implementation of the turbo decoding
algorithm.

The need for the scaling and soft limiting of alpha and
beta function values can be illustrated by an examination of
the definitions of these probability functions in equations 3
and 4, respectively. The alpha probability function
$\alpha_t(m) = \Pr\{S_t = m;\, Y_1^t\}$ is the joint probability of the
encoder being in state m at time t and receiving the t-symbol
sequence $Y_1^t$. Note that

$$ \sum_m \alpha_t(m) = \sum_m \Pr\{S_t = m;\, Y_1^t\} = \Pr\{Y_1^t\} \qquad (25) $$

Those with ordinary skill in the art will readily see that
$\Pr\{Y_1^t\}$ decreases as t increases. Since $\alpha_t(m)$ must be less
than or equal to $\Pr\{Y_1^t\}$ according to the equation above, the
alpha function also must decrease as decoding progresses
through the trellis (t increases). It is therefore obvious to
those with ordinary skill in the art that as the length of a code
word increases so does the dynamic range of the alpha
probability function.

Similarly, since the beta probability function
$\beta_t(m) = \Pr\{Y_{t+1}^T \mid S_t = m\}$ is the conditional probability
of receiving the (T-t)-symbol sequence $Y_{t+1}^T$ given that the
encoder state at time t is m, these conditional probability
estimates decrease as the backward recursion progresses through
the trellis (t decreases), and the dynamic range of the beta
probability function increases.

Furthermore, the definitions of the alpha and beta
probability functions suggest that the following normalization
be used as scaling to reduce the dynamic range of the decoding
arithmetic. Since $\alpha_t(m)$ and $\beta_t(m)$ are probability estimates,
$0 \le \alpha_t(m) \le 1$ and $0 \le \beta_t(m) \le 1$. This implies that

$$ \frac{\alpha_t(m)}{\sum_k \alpha_t(k)} \le 1 \quad\text{and}\quad \frac{\beta_t(m)}{\sum_k \beta_t(k)} \le 1 \qquad (26) $$

Therefore, one embodiment of the invention uses

$$ \ln\Bigl[\sum_m \alpha_t(m)\Bigr] $$

as the normalization factor for $-\ln[\alpha_t(m)]$ and

$$ \ln\Bigl[\sum_m \beta_t(m)\Bigr] $$

as the normalization factor for $-\ln[\beta_t(m)]$. That is, the
negative of the logarithm of the alpha and beta probability
function values are normalized after each recursion. Referring
to FIG. 11, this involves calculating the logarithm of the
sum of the alpha probability estimates at time t over all the
states in log-addition block 48 (which comprises the
normalization block 48) and adding the resultant sum to the
negative of the logarithm of each alpha probability for trellis
level t in subtracter 140. In this embodiment, the
normalization of the beta probability values is implemented in
an analogous fashion. The probability values are now scaled
such that the maximum values of $-\ln[\alpha_t(m)]$ and $-\ln[\beta_t(m)]$
are between 0 and $-\ln[1/M]$, where M is the number of states
in the trellis. Unfortunately, however, since computation of
the normalization value is dependent on all the state
probability estimates, the time required to compute this
normalization value must be added to the time required to
execute equation 12 or 13. This added execution time can double
the total time required to perform the alpha and beta update and
thus could potentially halve the decoding speed of the
decoder.

In FIG. 11, adders 58 and 60 and log-addition block 62
have been described hereinabove with reference to FIG. 4a.
As illustrated, normalization block 48 comprises a 16-input
log-addition block which computes the normalization factor.
In particular, block 48 comprises a tree-like structure with
log-additions being done in pairs of log-addition blocks,
exemplary log-addition blocks having structures such as
block 76 of FIG. 5c.

As explained hereinabove, in a turbo decoder, the alpha
and beta probability estimates are used to compute sigma
probability estimates, which, in turn, are used to compute the
a posteriori symbol estimates. The difference between the
two ($APP_t(0)$ and $APP_t(1)$) a posteriori symbol estimates is
the log-likelihood ratio, which is the output of the MAP
turbo decoder. Advantageously, because of the way that the
state probability estimates are used in the turbo decoder, it
can be shown that it is not necessary to use a normalization
such that the state probability estimates sum to unity. It is
only essential that the relative magnitudes be preserved.
A constant scaling applied to either all the alpha state
probability estimates or the beta state probability estimates
does not affect the log-likelihood ratio output.

In light of the fact that the value of the scale factor does
not affect decoding performance from an algorithmic point
of view, the scale factor applied can be advantageously
chosen to best match the dynamic range of the probability
values to the integer number system utilized in the decoder.
Because the most probable paths through the trellis are of
greatest interest, a preferred embodiment of the invention
uses the largest alpha function value at trellis level t as the
scale factor for all $\alpha_t(m)$ in alpha and beta block of FIG. 9.
Similarly, the largest beta function value at trellis level t is
used as the scale factor for all $\beta_t(m)$ in alpha and beta block
in this embodiment of the invention.

Assuming that the rate at which the alpha or beta
probability estimates are drifting toward zero probability is
much smaller than the dynamic range of the number system used
to store the alpha and beta probability estimates (which is
typically the case), then it is possible to use past values (i.e.,
past trellis levels) of the alpha and beta functions to pre-